The Lagrangian
We consider an optimization problem in the standard form:
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \leq 0, \quad i = 1, \dots, m \\
& h_i(x) = 0, \quad i = 1, \dots, p
\end{array}
\]
with variable $x \in \mathbf{R}^n$. We assume its domain $\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i \cap \bigcap_{i=1}^{p} \operatorname{dom} h_i$ is nonempty, and denote the optimal value by $p^\star$. We do not assume the problem is convex.
The basic idea in Lagrangian duality is to take the constraints into account by augmenting the objective function with a weighted sum of the constraint functions. We define the Lagrangian $L : \mathbf{R}^n \times \mathbf{R}^m \times \mathbf{R}^p \to \mathbf{R}$ as
\[
L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x),
\]
with $\operatorname{dom} L = \mathcal{D} \times \mathbf{R}^m \times \mathbf{R}^p$. We refer to $\lambda_i$ as the Lagrange multiplier associated with the $i$th inequality constraint $f_i(x) \leq 0$; similarly, we refer to $\nu_i$ as the Lagrange multiplier associated with the $i$th equality constraint $h_i(x) = 0$. The vectors $\lambda$ and $\nu$ are called the dual variables or Lagrange multiplier vectors.
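To make the definition concrete, here is a minimal Python sketch that evaluates the Lagrangian of a small two-variable problem. The problem data, the helper names f0, f1, h1, and the chosen point and multipliers are all made up for illustration.

```python
import numpy as np

# Hypothetical toy problem (data made up for illustration only):
#   minimize    f0(x) = x1^2 + x2^2
#   subject to  f1(x) = 1 - x1 - x2 <= 0
#               h1(x) = x1 - x2      = 0

def f0(x):
    return x[0]**2 + x[1]**2

def f1(x):
    return 1.0 - x[0] - x[1]

def h1(x):
    return x[0] - x[1]

def lagrangian(x, lam, nu):
    # L(x, lam, nu) = f0(x) + sum_i lam_i * f_i(x) + sum_i nu_i * h_i(x)
    return f0(x) + lam[0] * f1(x) + nu[0] * h1(x)

x = np.array([0.3, 0.4])
lam = np.array([1.5])    # multiplier for the inequality constraint (lam >= 0)
nu = np.array([-0.2])    # multiplier for the equality constraint (any sign)
print(lagrangian(x, lam, nu))   # objective plus weighted constraint terms
```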
The Lagrange dual function
We define the Lagrange dual function (or just dual function) $g : \mathbf{R}^m \times \mathbf{R}^p \to \mathbf{R}$ as the minimum value of the Lagrangian over $x$: for $\lambda \in \mathbf{R}^m$, $\nu \in \mathbf{R}^p$,
\[
g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) = \inf_{x \in \mathcal{D}} \left( f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right).
\]
When the Lagrangian is unbounded below in $x$, the dual function takes on the value $-\infty$. Since the dual function is the pointwise infimum of a family of affine functions of $(\lambda, \nu)$, it is concave, even when the problem is not convex.
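Continuing the same made-up toy problem, the dual function can be evaluated by minimizing the Lagrangian over $x$: for each fixed $(\lambda, \nu)$ the Lagrangian is an unconstrained convex quadratic in $x$, so the minimizer is available in closed form. The sketch below computes $g$ this way and spot-checks that it is concave.

```python
import numpy as np

# Toy problem from the previous sketch:
#   minimize x1^2 + x2^2   s.t.   1 - x1 - x2 <= 0,   x1 - x2 = 0
# For fixed (lam, nu) the Lagrangian is a convex quadratic in x; setting its
# gradient to zero gives the minimizer x1 = (lam - nu)/2, x2 = (lam + nu)/2.

def dual_function(lam, nu):
    x1 = (lam - nu) / 2.0
    x2 = (lam + nu) / 2.0
    return (x1**2 + x2**2
            + lam * (1.0 - x1 - x2)
            + nu * (x1 - x2))       # simplifies to lam - lam^2/2 - nu^2/2

# Spot-check concavity of g: midpoint value >= average of endpoint values.
rng = np.random.default_rng(0)
for _ in range(5):
    a, b = rng.normal(size=2), rng.normal(size=2)
    m = (a + b) / 2.0
    mid = dual_function(m[0], m[1])
    avg = (dual_function(a[0], a[1]) + dual_function(b[0], b[1])) / 2.0
    assert mid >= avg - 1e-12

print(dual_function(1.0, 0.0))   # 0.5
```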
Lower bounds on optimal value
The dual function yields lower bounds on the optimal value $p^\star$ of the problem: for any $\lambda \succeq 0$ and any $\nu$ we have
\[
g(\lambda, \nu) \leq p^\star.
\]
This important property is easily verified. Suppose $\tilde{x}$ is a feasible point, i.e., $f_i(\tilde{x}) \leq 0$ and $h_i(\tilde{x}) = 0$, and $\lambda \succeq 0$. Then we have
\[
\sum_{i=1}^{m} \lambda_i f_i(\tilde{x}) + \sum_{i=1}^{p} \nu_i h_i(\tilde{x}) \leq 0,
\]
since each term in the first sum is nonpositive, and each term in the second sum is zero, and therefore
\[
L(\tilde{x}, \lambda, \nu) = f_0(\tilde{x}) + \sum_{i=1}^{m} \lambda_i f_i(\tilde{x}) + \sum_{i=1}^{p} \nu_i h_i(\tilde{x}) \leq f_0(\tilde{x}).
\]
Hence
\[
g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) \leq L(\tilde{x}, \lambda, \nu) \leq f_0(\tilde{x}).
\]
Since $g(\lambda, \nu) \leq f_0(\tilde{x})$ holds for every feasible point $\tilde{x}$, the inequality follows.
The inequality holds, but is vacuous, when $g(\lambda, \nu) = -\infty$. The dual function gives a nontrivial lower bound on $p^\star$ only when $\lambda \succeq 0$ and $(\lambda, \nu) \in \operatorname{dom} g$, i.e., $g(\lambda, \nu) > -\infty$. We refer to a pair $(\lambda, \nu)$ with $\lambda \succeq 0$ and $(\lambda, \nu) \in \operatorname{dom} g$ as dual feasible, for reasons that will become clear later.
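The bound can also be observed numerically. For the made-up toy problem used above, the optimal value is $p^\star = 1/2$, attained at $x = (1/2, 1/2)$; the sketch below samples many dual-feasible pairs and confirms that $g(\lambda, \nu)$ never exceeds $p^\star$.

```python
import numpy as np

# Toy problem from the previous sketches: p_star = 0.5 at x = (0.5, 0.5).
p_star = 0.5

def dual_function(lam, nu):
    # Closed-form dual function derived in the previous sketch.
    return lam - lam**2 / 2.0 - nu**2 / 2.0

rng = np.random.default_rng(1)
lams = rng.uniform(0.0, 5.0, size=1000)   # lam >= 0 (dual feasibility)
nus = rng.normal(0.0, 5.0, size=1000)     # nu is unrestricted in sign
bounds = dual_function(lams, nus)

assert np.all(bounds <= p_star + 1e-12)   # weak duality: g(lam, nu) <= p_star
print(bounds.max())                       # best sampled lower bound, <= 0.5
```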
The Lagrange dual problem
For each pair (λ,ν) with λ⪰0, the Lagrange dual function gives us a lower bound on the optimal value p⋆ of the optimization problem. Thus we have a lower bound that depends on some parameters λ,ν. A natural question is: What is the best lower bound that can be obtained from the Lagrange dual function?
This leads to the optimization problem
\[
\begin{array}{ll}
\mbox{maximize} & g(\lambda, \nu) \\
\mbox{subject to} & \lambda \succeq 0.
\end{array}
\]
This problem is called the Lagrange dual problem. In this context the original problem is sometimes called the primal problem. The term dual feasible, to describe a pair $(\lambda, \nu)$ with $\lambda \succeq 0$ and $g(\lambda, \nu) > -\infty$, now makes sense. It means, as the name implies, that $(\lambda, \nu)$ is feasible for the dual problem. We refer to $(\lambda^\star, \nu^\star)$ as dual optimal or optimal Lagrange multipliers if they are optimal for the dual problem.
The Lagrange dual problem is a convex optimization problem, since the objective to be maximized is concave and the constraint is convex. This is the case whether or not the primal problem is convex.
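Because the dual problem is just the maximization of a concave function subject to $\lambda \succeq 0$, it can be handed to any off-the-shelf solver. A minimal sketch, again using the closed-form dual function of the hypothetical toy problem from above, is:

```python
import numpy as np
from scipy.optimize import minimize

# Lagrange dual of the hypothetical toy problem used above:
#   maximize    g(lam, nu) = lam - lam^2/2 - nu^2/2
#   subject to  lam >= 0
# Maximizing the concave g is the same as minimizing the convex -g.

def neg_g(z):
    lam, nu = z
    return -(lam - lam**2 / 2.0 - nu**2 / 2.0)

res = minimize(neg_g, x0=np.array([1.0, 1.0]),
               bounds=[(0.0, None), (None, None)])
print(res.x)      # approximately [1.0, 0.0]
print(-res.fun)   # best lower bound on p_star, approximately 0.5
```

For this particular toy problem the best bound found, 0.5, happens to equal $p^\star$; in general the dual optimal value is only guaranteed to be a lower bound.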
In a similar way we can find the Lagrange dual problem of a linear program in inequality form
\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b.
\end{array}
\]
The Lagrangian is
\[
L(x, \lambda) = c^T x + \lambda^T (Ax - b) = -b^T \lambda + (A^T \lambda + c)^T x,
\]
so the dual function is
\[
g(\lambda) = \inf_x L(x, \lambda) = -b^T \lambda + \inf_x (A^T \lambda + c)^T x.
\]
The infimum of a linear function is $-\infty$, except in the special case when it is identically zero, so the dual function is
\[
g(\lambda) =
\begin{cases}
-b^T \lambda & A^T \lambda + c = 0 \\
-\infty & \mbox{otherwise.}
\end{cases}
\]
The dual variable $\lambda$ is dual feasible if $\lambda \succeq 0$ and $A^T \lambda + c = 0$.

The Lagrange dual of the LP is to maximize $g$ over all $\lambda \succeq 0$. Again we can reformulate this by explicitly including the dual feasibility conditions as constraints, as in
\[
\begin{array}{ll}
\mbox{maximize} & -b^T \lambda \\
\mbox{subject to} & A^T \lambda + c = 0 \\
& \lambda \succeq 0,
\end{array}
\]
which is an LP in standard form.
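As a concrete check, the sketch below builds a small inequality-form LP with made-up data (the particular $A$, $b$, $c$ are hypothetical), solves it with scipy.optimize.linprog, then solves the dual LP derived above and compares the two optimal values.

```python
import numpy as np
from scipy.optimize import linprog

# Small inequality-form LP with made-up data:
#   minimize    c^T x
#   subject to  A x <= b        (here: 0 <= x <= 1 componentwise)
c = np.array([-1.0, -2.0])
A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0,  0.0],
              [ 0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

primal = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)

# Dual derived above:  maximize -b^T lam  s.t.  A^T lam + c = 0,  lam >= 0,
# written as a minimization of b^T lam for linprog.
dual = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0.0, None)] * 4)

print(primal.fun)   # p_star = c^T x_star = -3.0
print(-dual.fun)    # dual optimal value -b^T lam_star = -3.0
```

The dual optimal value is a lower bound on $p^\star$, and for this example the bound is tight.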
Note the interesting symmetry between the standard and inequality form LPs and their duals: The dual of a standard form LP is an LP with only inequality constraints, and vice versa.
Reference
Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004.